Predicting the Output of Solar Photovoltaic Panels in the Absence of Weather Data Using Only the Power Output of the Neighbouring Sites

Capable models for forecasting the output of solar photovoltaic (PV) panels are increasingly needed, since such forecasts are vital for optimizing the performance and maintenance of PV systems. Few studies, however, address forecasting the output power of solar PV sites in the absence of meteorological data. Unlike common methods, this study explores several machine learning algorithms for forecasting the output of solar PV panels without weather data such as temperature, humidity and wind speed, which are often used for this task. The considered models include Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), Recurrent Neural Network (RNN) and Transformer. These models were applied to data collected from 50 different solar photovoltaic sites in South Korea, consisting of readings of the output of each site collected at regular intervals. This study focuses on obtaining multistep forecasts in the multi-in multi-out, multi-in uni-out and uni-in uni-out settings. Detailed experimentation was carried out in each of these settings. Finally, the best models were identified for each setting and for different lookback and forecast lengths.


Introduction
The popularity of solar photovoltaic (PV) panels as a source of renewable energy is at an all-time high [1]. The high demand for electricity and the environmental effects of non-renewable energy sources have made energy sources such as solar more desirable [2]. This surge in the number of PV systems installed globally increases the need to optimize the performance and minimize the cost of these systems. For both goals, forecasting the output of these systems is crucial.
Weather conditions, especially solar irradiance, temperature, humidity and wind speed, strongly affect the output of PV sites [3]. These weather factors are also important in forecasting the output of PV sites. It is therefore understandable that most studies on forecasting the output of PV sites focus on weather data [4,5]. However, weather data are not always available. Faulty sensors or equipment, power outages and network failures can result in missing or inaccurate data. Meteorological stations may also be located far from the solar PV sites, so that their readings represent the local weather conditions inaccurately. Additionally, historical meteorological data may not be available, which makes long-term forecasting difficult. Furthermore, the PV sites may be located in remote areas where meteorological stations are difficult to operate due to limited infrastructure [6]. Finally, collecting and processing meteorological data may be expensive and infeasible for small-scale PV sites with low operating budgets [7].

Time Series Forecasting
Time series analysis is a long-standing challenge that has been widely investigated. A range of traditional methods have been developed to tackle the problem, including Hidden Markov Models [8]; Kalman filters [9]; and statistical methods such as ARIMA [10], exponentially weighted moving averages [11,12] and vector autoregressors [13]. Deep learning techniques have also been applied to time series forecasting (TSF), with Recurrent Neural Networks (RNNs) being the predominant architecture due to their ability to model sequential data [14,15,16,17]. Recently, temporal convolutional networks (TCNs) have gained popularity in the field. In combination with RNNs, graph neural networks (GNNs) have also been used to capture spatial and temporal patterns in data [18,19,20,21,22,23]. With the advent of transformer [24] architectures, RNNs have been replaced in many applications of sequential modeling. The self-attention mechanism has been a key factor in the success of transformers, although the quadratic computation and memory complexity associated with self-attention is problematic for longer sequences. Consequently, the focus of most transformer architectures has been on developing more efficient designs that employ sparser query matrices to compute self-attention [25,26]. Recent developments in deep learning for time series forecasting have integrated classical concepts with modern deep learning techniques, achieving promising results [27,28]. For instance, Autoformer decomposes the original time series into seasonality and trend components and then extracts the dependencies using an autocorrelation block [27]. SCINet employs a multi-resolution analysis approach that combines downsampling techniques, convolutions and a unique interaction block to capture the dependencies in the data [28].

Forecasting of Solar PV Power
There has been immense interest in forecasting the output of solar PV plants. The methods used range from traditional statistical models to simple machine learning models and the latest deep-learning-based techniques.
Most of the studies dealing with the forecast of the output of solar PV plants utilize the meteorological data related to the respective sites. One study compared various regression techniques, ranging from linear least squares to support vector machines (SVM) with different kernel functions, using weather forecasts to predict hourly solar intensity [29]. Similarly, another work studied SVM models that differed in their dimensionality reduction techniques for forecasting PV power [30]. A simple neural network has also previously been suggested to forecast the global horizontal irradiance (GHI) and direct normal irradiance (DNI) using weather forecasts as predictors [31]. That work used the Genetic Algorithm (GA) and Gamma test for initial feature selection and various neural network structures to perform forecasting. A different study used only the endogenous variables to obtain forecasting models such as Autoregressive Integrated Moving Average (ARIMA), k-Nearest-Neighbours (KNN), neural networks and neural networks optimized by Genetic Algorithms to predict hourly PV generation with a prediction horizon of up to 2 h [32]. A comparison between various statistical and machine learning models such as SVM, binary regression trees, Random Forest (RF), gradient boosted regression trees (GBRT) and Generalized Additive Models (GAM) was presented for performing 1-day-ahead hourly PV power generation forecasts for some power plants in France [33,34]. A different work studied the influence of spatial and temporal information on solar power generation using gradient boosting alongside a vector autoregressive model and compared the results with those of an autoregressive model [35]. Probabilistic forecasting of solar power generation has also been performed using quantile regression forests [34,36].
Taking these challenges into account, it is essential that alternative approaches are studied for forecasting the output power of PV sites in the absence of meteorological data. In this study, the feasibility of utilizing the output power readings of multiple solar PV sites located near one another, for multi-step forecasting of the output power of these sites, was studied. The forecasts were performed in various settings: (i) utilizing the data of a single site for both input and output (univariate input and univariate output), (ii) utilizing the data of multiple sites as the input and that of a single site as the output (multivariate input and univariate output), and (iii) utilizing the data of multiple sites for both input and output (multivariate input and multivariate output). Numerous machine learning models were used to perform these forecasts, including Long Short-Term Memory (LSTM) [37], Recurrent Neural Network (RNN) [38], Gated Recurrent Unit (GRU) [39] and Transformer [24]. The data used for this study came from 41 solar PV sites located in Suncheon, South Korea. It included the readings of the output of the panels recorded at a regular interval of 15 min, over a period of 6 months.
The main contributions of this paper include the following:
• A study of the feasibility of forecasting solar PV outputs in the absence of meteorological data.
• Utilizing popular deep learning models for the forecast of the solar PV output for optimization of the performance and minimization of the maintenance costs of PV sites.
• Identifying an appropriate method for the forecast of solar PV output at various forecasting lengths.
• Suggesting a suitable workflow for the forecasting of solar PV outputs in scenarios where meteorological data are unavailable and under three different settings: the multivariate, univariate and multi-in uni-out settings.

Methods
The initial steps of this study were data collection and pre-processing. The data obtained from these steps are defined in Section 3.1. After obtaining the data, the next step prior to making forecasts was training the forecasting models. The models used for training are described in Section 3.2. Finally, the trained models were tested on the unseen data, and the models performing the best were deployed. The results obtained from this step are described in Section 4. All of these steps can be visualized in Figure 1. A view of the location of some PV sites used in this study is shown in Figure 2.

Data Description
The data used in this study came from 50 solar photovoltaic (PV) sites located in Suncheon, South Korea. The data were collected in a time series format and consisted of readings of the output of the PV panels taken at regular intervals of 15 min over a period of 6 months from 1 January 2020 to 31 July 2020. A total of 11,142 data points were used. A small portion of the dataset is shown in Table 1.
The obtained data were pre-processed. The first pre-processing step was dealing with missing data points and removing inconsistencies. For example, there should not be any PV output during nighttime, so any non-zero nighttime readings were set to zero. Missing data points were replaced by the average of nearby data points. Because an inconsistent dataset could lead to poor results, sites for which a large portion of the data was missing were discarded. Originally, the dataset contained readings from 50 different sites. However, during the pre-processing step, the data from 10 of the sites were removed due to inconsistencies and long intervals of missing data. After this, the data were split into train, test and validation sets and scaled.
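The pre-processing steps described above can be illustrated with a minimal, dependency-free Python sketch. It is not the code used in this study: the daylight window (`sunrise`, `sunset`), the 20% missing-data threshold and the choice of min-max scaling are assumptions made only for illustration, since the text does not specify them, and all function names are hypothetical.

```python
from statistics import mean

def zero_night(series, hours, sunrise=6, sunset=19):
    """Force readings outside an assumed daylight window to zero."""
    return [v if sunrise <= h < sunset else 0.0 for v, h in zip(series, hours)]

def fill_gaps(series):
    """Replace None entries with the mean of the nearest valid neighbours."""
    filled = list(series)
    for i, value in enumerate(series):
        if value is None:
            left = next((series[j] for j in range(i - 1, -1, -1)
                         if series[j] is not None), None)
            right = next((series[j] for j in range(i + 1, len(series))
                          if series[j] is not None), None)
            neighbours = [v for v in (left, right) if v is not None]
            filled[i] = mean(neighbours) if neighbours else 0.0
    return filled

def keep_site(series, max_missing_fraction=0.2):
    """Keep a site only if the share of missing readings is small enough."""
    missing = sum(v is None for v in series)
    return missing / len(series) <= max_missing_fraction

def min_max_scale(series, lo, hi):
    """Scale to [0, 1]; lo and hi should come from the training split."""
    span = (hi - lo) or 1.0
    return [(v - lo) / span for v in series]

# Toy example: one site, six readings, one gap, two spurious night values.
raw = [0.3, 1.0, None, 3.0, 4.0, 0.5]
hours = [3, 7, 8, 9, 10, 21]

site_is_kept = keep_site(raw)
clean = fill_gaps(zero_night(raw, hours))
scaled = min_max_scale(clean, lo=min(clean), hi=max(clean))
```

In practice the scaling statistics would be computed on the training split only and reused for the validation and test splits, so that no information from unseen data leaks into training.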
It is important to note that this study does not utilize weather data, such as temperature, humidity and wind speed, which are often used in forecasting the output of solar PV panels. This highlights the need for alternative approaches when weather data are not available.
In this study, I focus on multi-step time series forecasting, where the goal is to predict the output of the PV panels at multiple future time steps based on the readings at multiple time steps in the past. Despite the absence of weather data, this study aims to demonstrate the feasibility of using alternative approaches to forecast the output of solar PV panels.

Used Forecasting Models
For the purpose of this study, I used four popular machine learning models to forecast the solar PV output: Recurrent Neural Network (RNN) [38], Gated Recurrent Unit (GRU) [39], Long Short-Term Memory (LSTM) [37] and Transformer [24].

Recurrent Neural Network (RNN) [38]
A Recurrent Neural Network (RNN) is a type of artificial neural network that is well suited for processing sequential data. RNNs are used for tasks such as natural language processing and speech recognition, where the input data have a temporal structure. RNNs maintain an internal hidden state that can carry information from the sequence of inputs up to a given time step. In principle, this allows RNNs to model dependencies across time steps in sequential data.
RNNs consist of a series of interconnected nodes or neurons with recurrent connections, so that the output of the hidden layer at one time step is fed back as an input at the next. The hidden state of the network is updated at each time step based on the previous hidden state and the input data at that time step. This allows RNNs to model the dependencies between the data at different time steps, which is crucial for processing sequential data.
The main drawback of traditional RNNs is that they are prone to the vanishing gradient problem, where the gradient of the error signal with respect to the network parameters decreases exponentially as it propagates through time. This makes it difficult to train RNNs on long sequences, as the gradient of the error signal becomes very small, making it difficult to update the network parameters. To overcome this, various variants of RNNs have been developed, such as LSTMs (Long Short-Term Memory) and GRUs (Gated Recurrent Units), which are more robust to the vanishing gradient problem. The architecture of a general RNN is shown in Figure 3.
The equations for an RNN model at time step t can be expressed as follows:

$$h_t = f(W_{hx} x_t + W_{hh} h_{t-1} + b_h)$$
$$y_t = g(W_{yh} h_t + b_y)$$

where $h_t$ is the hidden state at time step $t$, $x_t$ is the input at time step $t$, $y_t$ is the output at time step $t$, $W_{hx}$ and $W_{hh}$ are weight matrices, $b_h$ is the bias vector for the hidden layer, $W_{yh}$ is the weight matrix and $b_y$ is the bias vector for the output layer. $f$ and $g$ are activation functions, which are often the hyperbolic tangent or the rectified linear unit (ReLU) function.

Figure 3. The architecture of a simple RNN unrolled.
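As an illustration of the recurrence described above, in which the hidden state is updated from the previous hidden state and the current input, the following dependency-free Python sketch performs the update step by step. It assumes $f = \tanh$ and an identity output activation $g$; the weights are arbitrary toy values and the helper names (`matvec`, `rnn_step`) are hypothetical, not taken from the implementation used in this study.

```python
import math

def matvec(W, v):
    """Multiply a matrix W (list of rows) by a vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rnn_step(x_t, h_prev, W_hx, W_hh, b_h, W_yh, b_y):
    """One RNN update: h_t = tanh(W_hx x_t + W_hh h_prev + b_h), y_t = W_yh h_t + b_y."""
    pre = [a + b + c for a, b, c in
           zip(matvec(W_hx, x_t), matvec(W_hh, h_prev), b_h)]
    h_t = [math.tanh(p) for p in pre]                       # f = tanh (assumed)
    y_t = [a + b for a, b in zip(matvec(W_yh, h_t), b_y)]   # g = identity (assumed)
    return h_t, y_t

# Toy dimensions: 1 input feature, 2 hidden units, 1 output.
W_hx = [[0.5], [-0.5]]
W_hh = [[0.1, 0.0], [0.0, 0.1]]
b_h = [0.0, 0.0]
W_yh = [[1.0, -1.0]]
b_y = [0.0]

h = [0.0, 0.0]            # initial hidden state
outputs = []
for x in [1.0, 0.5, 0.0]:  # a short input sequence
    h, y = rnn_step([x], h, W_hx, W_hh, b_h, W_yh, b_y)
    outputs.append(y)
```

The same weights are reused at every time step; only the hidden state `h` changes, which is what lets the network carry information forward through the sequence.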

Gated Recurrent Unit (GRU) [39]
Gated Recurrent Units (GRUs) are a type of Recurrent Neural Network (RNN) designed to capture the long-term dependencies between time steps in sequential data. Unlike traditional RNNs, which update the hidden state through a single ungated transformation, GRUs have a gating mechanism that allows them to selectively choose which information to preserve from previous time steps and which to discard.
The GRU has two gates, the reset gate and the update gate, which are used to control the flow of information from previous time steps. The reset gate decides how much of the previous hidden state is to be forgotten, while the update gate decides how much of the previous hidden state is to be combined with the current input. The final hidden state is then used to predict the output at the current time step.
The GRU's gating mechanism allows it to efficiently handle long sequences of data, as it can selectively preserve the most relevant information and discard the rest. Additionally, GRUs require fewer parameters than LSTMs, which can reduce overfitting and improve model training efficiency.
GRUs have been applied to various time series forecasting problems, including solar PV output forecasting. In such applications, the GRU is trained on historical time series data to capture the relationships between time steps and then used to make predictions for future time steps. The input to the GRU is the time series data, and the output is a prediction for the future values of the time series. The architecture of a single block of GRU is shown in Figure 4.
The update gate, reset gate and new memory cell vector of a Gated Recurrent Unit (GRU) can be computed using the following equations:

$$z_t = \sigma(W_z x_t + U_z h_{t-1} + b_z)$$
$$r_t = \sigma(W_r x_t + U_r h_{t-1} + b_r)$$
$$\tilde{h}_t = \tanh(W x_t + U (r_t \odot h_{t-1}) + b)$$
$$h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$$

where $x_t$ is the input at time step $t$; $h_{t-1}$ is the hidden state at the previous time step; $W_z$, $U_z$ and $b_z$ are the weights and bias for the update gate; $W_r$, $U_r$ and $b_r$ are the weights and bias for the reset gate; $W$, $U$ and $b$ are the weights and bias for computing the new memory cell vector; $z_t$ is the update gate output; $r_t$ is the reset gate output; $\tilde{h}_t$ is the candidate hidden state; and $h_t$ is the current hidden state. The $\sigma$ function is the sigmoid function, and $\odot$ is the element-wise product (Hadamard product).
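A single GRU update can be sketched in plain Python as follows. This is an illustrative sketch rather than the implementation used in the experiments. It assumes the convention in which the update gate $z_t$ weights the candidate state and $(1 - z_t)$ weights the previous state; the opposite convention also appears in the literature and is equivalent up to relabelling. The parameter dictionary `p` and the helper names are hypothetical.

```python
import math

def matvec(W, v):
    """Multiply a matrix W (list of rows) by a vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]

def gru_step(x_t, h_prev, p):
    """One GRU update; p maps parameter names to weights and biases."""
    z = [sigmoid(a + b + c) for a, b, c in
         zip(matvec(p["W_z"], x_t), matvec(p["U_z"], h_prev), p["b_z"])]
    r = [sigmoid(a + b + c) for a, b, c in
         zip(matvec(p["W_r"], x_t), matvec(p["U_r"], h_prev), p["b_r"])]
    gated = [ri * hi for ri, hi in zip(r, h_prev)]   # r_t (Hadamard) h_{t-1}
    h_cand = [math.tanh(a + b + c) for a, b, c in
              zip(matvec(p["W"], x_t), matvec(p["U"], gated), p["b"])]
    # Assumed convention: z_t weights the candidate, (1 - z_t) the old state.
    return [(1 - zi) * hp + zi * hc for zi, hp, hc in zip(z, h_prev, h_cand)]

# Toy dimensions: 1 input feature, 2 hidden units, all weights zero.
params = {
    "W_z": zeros(2, 1), "U_z": zeros(2, 2), "b_z": [0.0, 0.0],
    "W_r": zeros(2, 1), "U_r": zeros(2, 2), "b_r": [0.0, 0.0],
    "W": zeros(2, 1), "U": zeros(2, 2), "b": [0.0, 0.0],
}
h_prev = [1.0, -2.0]

# With zero weights both gates equal 0.5 and the candidate state is 0,
# so the new state is exactly half of the previous one.
h_half = gru_step([1.0], h_prev, params)

# A strongly negative update-gate bias closes the gate (z_t near 0),
# so the previous hidden state is carried over almost unchanged.
closed = dict(params, b_z=[-20.0, -20.0])
h_keep = gru_step([1.0], h_prev, closed)
```

The second call shows the mechanism that lets a GRU retain information over many steps: when the update gate stays closed, the hidden state passes through the step essentially untouched.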

Long Short-Term Memory (LSTM) [37]
Long Short-Term Memory (LSTM) is a type of Recurrent Neural Network (RNN) architecture that is specifically designed to overcome the vanishing gradient problem faced by traditional RNNs. LSTM is widely used for time-series analysis, sequential prediction and natural language processing tasks.
LSTMs consist of memory cells that store information for an extended period of time and gates that control the flow of information into and out of the memory cells. The three gates in LSTM are the input gate, forget gate and output gate. These gates help in deciding what information should be stored in the memory, what information should be discarded and what information should be outputted.
The input gate controls the amount of new information that is allowed to enter the memory cell. The forget gate decides what information should be discarded from the memory cell. The output gate controls what information should be outputted from the memory cell.
LSTMs are trained using backpropagation through time (BPTT) to minimize a loss function that represents the difference between the predicted output and the true output. The weights of the gates and the memory cells are updated during the training process to minimize the loss function.
LSTMs are able to capture long-term dependencies in time-series data and to outperform traditional RNNs in tasks that require memory and sequential prediction. They have been widely used for time-series forecasting, natural language processing and speech recognition. The architecture of a single block of LSTM is shown in Figure 5.
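The equations for the different gates of LSTM are as follows:

$$f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)$$
$$i_t = \sigma(W_i [h_{t-1}, x_t] + b_i)$$
$$\tilde{C}_t = \tanh(W_C [h_{t-1}, x_t] + b_C)$$
$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$$
$$o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)$$
$$h_t = o_t \odot \tanh(C_t)$$

where $f_t$ is the forget gate, $i_t$ is the input gate, $\tilde{C}_t$ is the candidate cell state, $C_t$ is the cell state, $o_t$ is the output gate, $h_t$ is the hidden state at time $t$, $x_t$ is the input at time $t$, $W$ and $b$ are the weight and bias matrices, $[h_{t-1}, x_t]$ denotes the concatenation of the previous hidden state and the current input, and $\sigma$ is the sigmoid activation function.

Figure 5. The architecture of a single block of LSTM.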
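The roles of the three gates can be made concrete with a small plain-Python sketch of one LSTM update. It assumes the common formulation in which each gate acts on the concatenation of the previous hidden state and the current input; the parameter dictionary `p` and the helper names are hypothetical and are not taken from the implementation used in this study.

```python
import math

def matvec(W, v):
    """Multiply a matrix W (list of rows) by a vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM update; returns the new hidden state and cell state."""
    v = h_prev + x_t                      # list concatenation [h_{t-1}, x_t]

    def gate(name, activation):
        return [activation(a + b) for a, b in
                zip(matvec(p["W_" + name], v), p["b_" + name])]

    f = gate("f", sigmoid)                # forget gate
    i = gate("i", sigmoid)                # input gate
    o = gate("o", sigmoid)                # output gate
    c_cand = gate("C", math.tanh)         # candidate cell state
    c_t = [fi * cp + ii * cc for fi, cp, ii, cc in zip(f, c_prev, i, c_cand)]
    h_t = [oi * math.tanh(ci) for oi, ci in zip(o, c_t)]
    return h_t, c_t

# Toy dimensions: 1 input feature, 1 hidden unit, all weights zero.
params = {
    "W_f": [[0.0, 0.0]], "b_f": [0.0],
    "W_i": [[0.0, 0.0]], "b_i": [0.0],
    "W_o": [[0.0, 0.0]], "b_o": [0.0],
    "W_C": [[0.0, 0.0]], "b_C": [0.0],
}

# With zero weights every gate equals 0.5 and the candidate is 0,
# so the cell state is halved: 2.0 -> 1.0.
h_t, c_t = lstm_step([1.0], [0.0], [2.0], params)

# Open forget gate and closed input gate: the cell keeps its content.
remember = dict(params, b_f=[20.0], b_i=[-20.0])
_, c_kept = lstm_step([1.0], [0.0], [2.0], remember)
```

The second call illustrates why the cell state can preserve information over long spans: with the forget gate open and the input gate closed, the cell content is carried forward almost unchanged.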

Transformer [24]
Transformer is a neural network architecture that was introduced in 2017 by Vaswani et al. in the paper "Attention is All You Need" [24]. Transformer is designed to handle sequential data and has revolutionized the field of natural language processing (NLP). The key innovation of Transformer is the self-attention mechanism, which allows the model to weigh the importance of each element in the input sequence when making predictions. The original transformer and its variants have also been used in multiple time series forecasting applications [24,40].
A traditional Recurrent Neural Network (RNN) operates on a sequence by processing one element at a time while maintaining an internal state. In contrast, the Transformer operates on the entire sequence at once, allowing it to capture long-term dependencies between elements. This capability makes Transformer well suited for sequence-to-sequence problems, such as time series forecasting.
The Transformer architecture consists of an encoder and a decoder, both of which are made up of a series of stacked attention and feed-forward layers. The encoder takes the input sequence and computes a sequence of hidden states. The decoder then takes the hidden states and produces the output sequence.
The attention mechanism in Transformer computes a weight for each element in the input sequence, indicating its importance in the current prediction. These weights are used to compute a weighted sum of the hidden states, which is then used to make the prediction. This allows Transformer to focus on the most relevant parts of the input sequence when making predictions.
In addition to the self-attention mechanism, Transformer also uses multi-head attention, which allows the model to attend to multiple aspects of the input sequence at once. This allows the model to capture complex relationships between elements in the input sequence. Figure 6 shows the architecture of a transformer.
The self-attention in the Transformer architecture is given by the following equation:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{Q K^{T}}{\sqrt{d_k}}\right) V$$

This equation calculates the output of the self-attention mechanism, which is used to compute the relationship between each position in the input sequence. Here, $Q$ represents the queries, $K$ represents the keys, $V$ represents the values and $d_k$ is the dimensionality of the keys. The equation calculates the dot product of the queries and keys, which is divided by the square root of the dimensionality of the keys to scale the gradients. The result is passed through a softmax activation function to obtain a probability distribution over the keys, which is then used to weight the values.
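The scaled dot-product attention described above can be written out in a few lines of plain Python. The sketch below is for illustration only, with hypothetical function names and toy values; a practical implementation would use batched tensor operations in a framework such as PyTorch.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention for lists of query, key and value vectors."""
    d_k = len(K[0])
    outputs, weights = [], []
    for q in Q:
        # Dot product of the query with every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in K]
        w = softmax(scores)               # probability distribution over keys
        weights.append(w)
        # Weighted sum of the value vectors.
        outputs.append([sum(wj * v[c] for wj, v in zip(w, V))
                        for c in range(len(V[0]))])
    return outputs, weights

V = [[10.0], [20.0]]

# The query matches the first key better, so the first value gets more weight.
out, w = attention([[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], V)

# Identical keys give uniform weights, so the output is the mean of the values.
out_uniform, w_uniform = attention([[1.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]], V)
```

In a forecasting context, each row of `Q`, `K` and `V` would correspond to one time step of the lookback window, so the weights indicate which past time steps the model draws on for a given position.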
The multi-head attention in the transformer architecture is given by the following equation:

$$\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h) W^{O}$$

where

$$\mathrm{head}_i = \mathrm{Attention}(Q W_i^{Q}, K W_i^{K}, V W_i^{V})$$

This equation applies the self-attention mechanism multiple times in parallel to capture different relationships between the input sequence elements. Here, $Q$, $K$ and $V$ are the same as in the self-attention equation but are projected using weight matrices $W_i^{Q}$, $W_i^{K}$ and $W_i^{V}$ to create $h$ different subspaces, or "heads". The self-attention operation is then applied to each of these subspaces to produce $h$ different outputs, which are concatenated and linearly transformed using a weight matrix $W^{O}$ to produce the final output.

Forecast Settings
The forecasts were performed in three different settings, namely multi-in multi-out, multi-in uni-out and uni-in uni-out. For all three settings, the lookback length (LBL) is the number of past data points used as input, and the forecast horizon length (FHL) is the number of data points into the future for which forecasts are made. Multi-in multi-out is the scenario where, for fixed values of LBL and FHL, the forecasts for multiple sites are made using past input data from each of those sites. This is useful when multiple sites are to be monitored and managed simultaneously. Similarly, in the multi-in uni-out setting, for fixed values of LBL and FHL, the forecasts for a single site are made using past input data from multiple neighbouring sites. The multi-in uni-out setting is useful if the goal is to obtain longer-term forecasts of a single site with more accuracy. Finally, in the uni-in uni-out setting, for fixed values of LBL and FHL, the forecasts for a single site are made using past input data from the same site. This setting is useful when only single-site short-term forecasts are required and resources are not available for the multi-in uni-out setting. These resources might be computational resources or data. In terms of computational resources, processing data from multiple sites requires more computation than processing data from a single site. In terms of data, it is possible that only a single PV site is in operation or that the user does not have access to the data of neighbouring sites. These three settings are visualized in Figure 7.
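To make the three settings concrete, the following minimal Python sketch builds (input, target) pairs from a table of readings with one row per time step and one column per site. The function name `make_windows`, the toy data and the stride of one time step between consecutive windows are assumptions for illustration; the text does not specify how the windows were generated.

```python
def make_windows(data, lbl, fhl, target_sites=None):
    """Build (past, future) pairs from rows of per-site readings.

    data: list of time steps, each a list with one reading per site.
    lbl: lookback length (number of past steps used as input).
    fhl: forecast horizon length (number of future steps to predict).
    target_sites: indices of the sites to forecast (None means all sites).
    """
    n_sites = len(data[0])
    targets = list(range(n_sites)) if target_sites is None else list(target_sites)
    samples = []
    for start in range(len(data) - lbl - fhl + 1):
        past = data[start:start + lbl]
        future = [[row[s] for s in targets]
                  for row in data[start + lbl:start + lbl + fhl]]
        samples.append((past, future))
    return samples

# Toy readings: 10 time steps, 3 sites.
data = [[t, 10 + t, 20 + t] for t in range(10)]
lbl, fhl = 4, 2

# Multi-in multi-out: all sites as input, all sites as output.
mimo = make_windows(data, lbl, fhl)

# Multi-in uni-out: all sites as input, only site 0 as output.
miuo = make_windows(data, lbl, fhl, target_sites=[0])

# Uni-in uni-out: site 0 alone as both input and output.
site0 = [[row[0]] for row in data]
uiuo = make_windows(site0, lbl, fhl)
```

Only the shapes of the input and the target change between settings; the forecasting model itself can remain the same apart from its input and output dimensions.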

Experiment and Results
These experiments tested the possibility of forecasting using only the historical recordings of the solar PV output, in the absence of any kind of weather information. Four widely used deep learning architectures (RNN, GRU, LSTM and Transformer) were compared in terms of mean square error (MSE) and mean absolute error (MAE) of their forecasts.

Implementation Details
The data were split into train, test and validation sets at a ratio of 8:1:1. All of the models were then trained with L2 loss, which is the squared difference between the actual value and the predicted value, using the ADAM [41] optimizer. The code was implemented using the PyTorch framework, and the experiments were performed on a single NVIDIA GeForce RTX 3060 laptop GPU.
Furthermore, the experiments were performed in three different setups: (i) the multi-in multi-out setting, (ii) the multi-in uni-out setting and (iii) the uni-in uni-out setting.
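A minimal sketch of an 8:1:1 split is given below. It assumes a chronological split without shuffling, which is a common choice for time series but is not stated in the text, and the order of the validation and test portions is likewise an assumption. The function name is hypothetical, and the example treats the 11,142 data points reported in the Data Description as the number of time steps.

```python
def split_8_1_1(samples):
    """Split a sequence chronologically into train, validation and test parts."""
    n = len(samples)
    n_train = n * 8 // 10
    n_val = n // 10
    train = samples[:n_train]
    val = samples[n_train:n_train + n_val]
    test = samples[n_train + n_val:]      # remainder goes to the test set
    return train, val, test

# Stand-in for the time-ordered readings: only the indices matter here.
time_steps = list(range(11142))
train, val, test = split_8_1_1(time_steps)
sizes = (len(train), len(val), len(test))
```

Keeping the split chronological means the model is always evaluated on readings that come after those it was trained on, which mirrors how a forecast would be used in practice.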

Details of Hyper-Parameters Used
The details of the hyper-parameters used are shown in Table 2. The hyper-parameter values are the same for RNN, GRU and LSTM in all settings. The learning rate was set to a very low value of $1 \times 10^{-5}$, as the models seemed to overfit within very few epochs at higher values. For the same reason, the configuration of each model was kept very simple. The number of RNN, GRU and LSTM layers was 2, and the number of encoders and decoders in Transformer was set to 3. The weight decay was set to $1 \times 10^{-6}$, and the batch size was fixed at 64 for each of the settings. Transformer was trained for 50 epochs, compared with 100 for the rest of the models, because it was overfitting very early.
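The reported hyper-parameters can be wired together in PyTorch roughly as follows. This is a sketch under stated assumptions rather than the code used in the experiments: the class name `LSTMForecaster`, the hidden size of 64, the three-site toy input and the linear head that emits all forecast steps at once are hypothetical choices. Only the two recurrent layers, the learning rate, the weight decay, the batch size and the L2 loss are taken from the text.

```python
import torch
from torch import nn

class LSTMForecaster(nn.Module):
    """Two-layer LSTM followed by a linear head that emits all FHL steps."""

    def __init__(self, n_in, n_out, fhl, hidden_size=64):  # hidden size assumed
        super().__init__()
        self.n_out, self.fhl = n_out, fhl
        self.lstm = nn.LSTM(n_in, hidden_size, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden_size, fhl * n_out)

    def forward(self, x):                 # x: (batch, LBL, n_in)
        out, _ = self.lstm(x)
        last = out[:, -1, :]              # hidden state at the last time step
        return self.head(last).view(-1, self.fhl, self.n_out)

torch.manual_seed(0)
lbl, fhl, n_sites, batch_size = 48, 4, 3, 64

model = LSTMForecaster(n_in=n_sites, n_out=n_sites, fhl=fhl)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5, weight_decay=1e-6)
criterion = nn.MSELoss()                  # L2 loss

# One training step on random stand-in data (multi-in multi-out shapes).
x = torch.randn(batch_size, lbl, n_sites)
y = torch.randn(batch_size, fhl, n_sites)
optimizer.zero_grad()
prediction = model(x)
loss = criterion(prediction, y)
loss.backward()
optimizer.step()
```

Replacing `nn.LSTM` with `nn.GRU` or `nn.RNN` leaves the rest of the sketch unchanged, since all three modules accept the same constructor arguments used here and return the output sequence as the first element.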

Evaluation Metrics
The evaluation metrics used to evaluate the performance of the forecasting models are mean square error (MSE) and mean absolute error (MAE).

Mean Square Error (MSE)
MSE is a common metric used to evaluate the accuracy of a regression model. It measures the average squared difference between the predicted values and the actual values. The formula for calculating MSE is as follows:

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

where $n$ is the number of observations, $y_i$ is the actual value and $\hat{y}_i$ is the predicted value. A lower MSE indicates better forecasting performance. Because the errors are squared, MSE amplifies the effect of larger errors, so a model trained or evaluated with it is more sensitive to them.

Mean Absolute Error (MAE)
MAE is another metric commonly used to evaluate regression models. It measures the average absolute difference between the predicted values and the actual values. The formula for calculating MAE is as follows:

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|$$

where $n$ is the number of observations, $y_i$ is the actual value and $\hat{y}_i$ is the predicted value. MAE has the same unit as the predicted output, which makes it easy to interpret. As with MSE, a lower MAE indicates better performance. MAE is less sensitive to outliers than MSE.
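The difference in outlier sensitivity between the two metrics can be seen in a small worked example. The values below are made up for illustration and the function names are hypothetical.

```python
def mse(y_true, y_pred):
    """Mean square error: average of the squared differences."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error: average of the absolute differences."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

actual = [1.0, 2.0, 3.0, 4.0]
even_errors = [1.5, 2.5, 3.5, 4.5]   # four errors of 0.5 each
one_outlier = [1.0, 2.0, 3.0, 6.0]   # a single error of 2.0

mae_even, mae_outlier = mae(actual, even_errors), mae(actual, one_outlier)
mse_even, mse_outlier = mse(actual, even_errors), mse(actual, one_outlier)
```

Both forecasts have an MAE of 0.5, but the MSE rises from 0.25 for the evenly spread errors to 1.0 for the forecast with one large miss, which is why reporting both metrics gives a fuller picture of forecast quality.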

Results of Multi-In Multi-Out Setting
The comparative results obtained for the multi-in multi-out setting for each of the forecasting models are shown in Table 3. Table 3 indicates that Transformer performs best in the multivariate predictive scenario, outperforming each of the other architectures by a wide margin. This can be attributed to the better ability of Transformer to capture the individual relationships between the input features. The RNN, GRU and LSTM models perform almost similarly when the lookback window length is 48 or 96, with either GRU or LSTM mostly performing better than the simple RNN model. LSTM performs much better than RNN and GRU when the lookback window length is 144 and the forecast horizon is 24. Furthermore, for a short FHL of 4 or 8, the performances are comparable for all LBLs, and the performance with the shortest LBL of 48 is arguably better. However, as the FHL becomes longer (12 or 24), the performance improves as the LBL increases. Therefore, Table 3 suggests that, when proper resources are available, the use of Transformer for the multivariate setting would provide the best results. However, architectures such as LSTM and GRU can also be used if required, since the forecasting results are acceptable in these cases as well. Additionally, for a shorter FHL, using a shorter LBL is sufficient; however, for a longer FHL, a longer LBL is required.

Results of Multi-In Uni-Out Setting
Table 4 gives the comparative results for the forecasting models in the multi-in uni-out setting. Table 4 shows that the LSTM and Transformer models go head to head across the different configurations. LSTM has better MAE values in most of the cases in this setting, and Transformer has better MSE values for all lookback window lengths and forecast horizon lengths. However, in the multi-in uni-out setting, there is little difference between the performance of the best and the worst performing models in each configuration. Additionally, as in the multi-in multi-out setting, for a short FHL of 4 or 8, the performances are comparable for all LBLs, and the performance with the shortest LBL of 48 is arguably better. However, as the FHL becomes longer (12 or 24), the performance improves as the LBL increases. Therefore, in this scenario, RNN, GRU or LSTM can be used, considering that there is no major difference in performance between the four models and that Transformer requires much larger computing resources than the rest of the models. Furthermore, as in the multi-in multi-out setting, for a shorter FHL, using a shorter LBL is sufficient; however, for a longer FHL, a longer LBL is required.

Results of Uni-In Uni-Out Setting
The results of the uni-in uni-out forecasting of the power of a solar PV site are shown in Table 5. In this setting, Transformer mostly dominates the other models in terms of performance. The rest of the models performed better than the Transformer model only occasionally. For example, RNN performed better than the rest of the models in terms of mean squared error when the lookback length was 48 and the forecast horizon was 4. Similarly, GRU performed better than the rest of the models in terms of mean squared error when the lookback length was 48 and the horizon was 12 and when the lookback length was 96 and the horizon was 4. Finally, LSTM performed better than the rest of the models both in terms of mean squared error and mean absolute error when the lookback length was 48 and the forecast horizon was 8 and only in terms of mean squared error when the lookback length was 96 and the forecast horizon was 4. As in the previous two settings, for a shorter FHL, the performances are comparable for different LBLs. For RNN, the performance for an FHL of 4 was even better when the LBL was 48 compared with an LBL of 96 or 144. However, when a longer FHL is to be forecasted, the results are better when a longer LBL is used.

Comparative Analysis of Results Obtained
The three forecast settings (multi-in multi-out, multi-in uni-out and uni-in uni-out) are each useful in their own right. A visual comparison of Tables 3-5 shows that these settings play different roles when the FHLs are varied for each LBL and each model. The simplest of the three settings, the uni-in uni-out setting, seems to be much more effective than the rest when the required FHL is short (4 or 8). Similarly, the multi-in uni-out setting seems to be more effective when the FHLs are longer (12 or 24). So, when the forecast of only a single site is required, either the multi-in uni-out or the uni-in uni-out setting can be used, depending on the FHL required. Finally, the results of the multi-in multi-out setting are better than those of the multi-in uni-out setting when the FHL is 4 or 8 but not as good as those of the uni-in uni-out setting for the same FHL. For a longer FHL, the results of the multi-in multi-out setting are not as good as those of the other two. However, the multi-in multi-out setting is still useful when the forecasts of all the sites are required at once without having to generate the forecast of individual sites separately.
The plots of the output of each of the forecasting models are presented in Figure 8. The plots in Figure 8 show that the forecast does not always follow the exact patterns of the actual output. However, the forecasts are close to the real output values. Moreover, the Transformer model seems to capture the output patterns better than the rest of the models.

Conclusions
Unlike previous studies on solar PV output forecasting, this study performed forecasts based solely on historical readings of the PV output values, without considering weather information. The aim was to confirm whether forecasting solar PV output values under such conditions is feasible. The data used were collected from multiple solar PV sites in Suncheon, South Korea, at regular intervals for a duration of 6 months. The forecasting models used for the study are RNN, GRU, LSTM and Transformer. This study was performed in three different settings (multivariate, multi-in uni-out and univariate). For each of the different settings, Transformer seemed to perform better most of the time. However, it is necessary to consider that the Transformer model requires more computing resources than the rest of the models. Therefore, a forecasting model should be selected based on the requirements and existing constraints such as the availability of resources and data. For instance, the results for the multi-in uni-out setting were not very different for each of the models, so if there is a resource constraint, then less powerful models such as LSTM, GRU or even RNN can be considered. Furthermore, it is necessary that the data in use are well processed and that there are very few anomalies in them. If these steps are properly performed, then the forecasting requirements of solar PV power plant outputs can be achieved even in the absence of meteorological data.
In future work, various existing architectures can be studied to perform solar output forecasts and a novel architecture can be suggested. Longer-term forecasts can also be studied for detailed system implementation for load management. Finally, the forecasted results can also be used to detect anomalies in the system by comparing the actual output of the plants with forecasted outputs.

Acknowledgments:
I am grateful to Ranjai Baidya for his helpful discussion regarding this paper and the help during the preparation of this manuscript.

Conflicts of Interest:
The author declares no conflicts of interest.

Abbreviations
The following abbreviations are used in this manuscript: